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Abstract 

The paper presents a new method of trend extraction in the frame- 
work of the Singular Spectrum Analysis (SSA) approach. This method 
is easy to use, does not need specification of models of time series and 
trend, allows to extract trend in the presence of noise and oscillations 
and has only two parameters (besides basic SSA parameter called win- 
dow length) . One parameter manages scale of the extracted trend and 
another is a method specific threshold value. We propose procedures 
for the choice of the parameters. The presented method is evaluated on 
a simulated time series with a polynomial trend and an oscillating com- 
ponent with unknown period and on the seasonally adjusted monthly 
data of unemployment level in Alaska for the period 1976/01-2006/09. 

Keywords: Time series; trend extraction; Singular Spectrum Analysis. 

1 INTRODUCTION 

Trend extraction is an important task in applied time series analysis, in 
particular in economics and engineering. We present a new method of trend 
extraction in the framework of the Singular Spectrum Analysis approach. 

Trend is usually defined as a smooth additive component containing 
information about time series global change. This definition is rather vague 
(which type of smoothness is used? which kind of information is contained 
in the trend?). It may sound strange, but there is no more precise definition 
of the trend accepted by the majority of researchers and practitioners. Each 
approach to trend extraction defines trend with respect to the mathematical 
tools used (e.g. using Fourier transformation or derivatives). Thus in the 
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corresponding literature one can find various specific definitions of the trend. 
For further discussion on trend issues we refer to |2]. 

Singular Spectrum Analysis (SSA) is a general approach to time series 
analysis and forecast. Algorithm of SSA is similar to that of Principal Com- 
ponents Analysis (PCA) of multivariate data. In contrast to PCA which is 
applied to a matrix, SSA is applied to a time series and provides a repre- 
sentation of the given time series in terms of eigenvalues and eigenvectors 
of a matrix made of the time series. The basic idea of SSA has been pro- 
posed by [5] for dimension calculation and reconstruction of attractors of 
dynamical systems, see historical reviews in [10] and in [11]. In this paper 
we mostly follow the notations of [llj . 

SSA can be used for a wide range of tasks: trend or quasi-periodic com- 
ponent detection and extraction, denoising, forecasting, change-point detec- 
tion. The present bibliography on SSA includes two monographes, several 
book chapters, and over a hundred papers. For more details see references at 
the website SSAwiki: http://www.math.uni-bremen.de/~theodore/ssawiki. 

The method presented in this paper has been first proposed in [3] and is 
studied in detail in the author's unpublished Ph.D. thesis [1] available only 
in Russian at http://www.pdmi.ras.ru/~theo/autossa. 

The proposed method is easy to use (has only two parameters), does 
not need specification of models of time series and trend, allows one to 
specify desired trend scale, and extracts trend in the presence of noise and 
oscillations. 

The outline of this paper is as follows. Section [2] introduces SSA, formu- 
lates properties of trends in SSA and presents the already existing methods 
of trend extraction in SSA. Section [3] proposes our method of trend extrac- 
tion. In section[3]we discuss the frequency properties of additive components 
of a time series and present our procedure for the choice of first parameter of 
the method, a low-frequency boundary. Section [5] starts with investigation 
of the role of the second method parameter, the low-frequency contribution, 
based on a simulation example. Then we propose a heuristic strategy for the 
choice of this parameter. In section [6l applications of the proposed method 
to a simulated time series with a polynomial trend and oscillations and on 
the unemployment level in Alaska are considered. Finally, section [7] offers 
conclusions. 
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2 SINGULAR SPECTRUM ANALYSIS 



Let us have a time series F = (/o, . . . , /n-i), fn G of length A^, and we 
are looking for some specific additive component of F (e.g. a trend). The 
central idea of SSA is to embed F into high-dimensional euclidean space, 
then find a subspace corresponding to the sought-for component and, finally, 
reconstruct a time series component corresponding to this subspace. The 
choice of the subspace is a crucial question in SSA. The basic SSA algorithm 
consists of decomposition of a time series and reconstruction of a desired 
additive component. These two steps are summarized below; for a detailed 
description, see page 16 of [llj. 

Decomposition. The decomposition takes a time series of length and 
comes up with an L x K matrix. This stage starts by defining a parameter 
L {1 < L < N), called the window length, and constructing the so-called 
trajectory matrix X G M^^^, K = N — L + 1, with stepwise taken portions 
of the original time series F as columns: 

F = (/o, . . . , fN-i) -^X=[Xi: ...:Xk], Xj = {fj_i, /,+l_2)^. (1) 

Note that X is a Hankel matrix and ([T]) defines one-to-one correspondence 
between series of length and Hankel matrices of size LxK. Then Singular 
Value Decomposition (SVD) of X is applied, where j-th component of SVD 
is specified by j-th eigenvalue Xj and eigenvector Uj of XX^: 

d 

X = ^ y/X'jUjVj^, Vj = X.^Uj j ^/X], d = max{j : Xj > 0}. 
i=i 

Since the matrix XX"'" is positive-definite, their eigenvalues Xj are positive. 
The SVD components are numbered in the decreasing order of eigenvalues 
Xj. We define j-th Empirical Orthogonal Function (EOF) as the sequence 
of elements of the j-th eigenvector Uj . The triple ( Uj ,Vj) is called j-th 
eigentriple, is called the j-th singular value, Uj is the j-th left singular 
vector and Vj is the j-th right singular vector. 

Reconstruction. Reconstruction goes from an L x K matrix into a 
time series of length N. This stage combines (i) selection of a subgroup 
c7 C {1, . . . , L} of SVD components; (ii) hankelization (averaging along en- 
tries with indices i + j = const) of the LxK matrix from the selected J 
components of the SVD; (iii) reconstruction of a time series component of 
length N from the Hankel matrix by the mentioned one-to-one correspon- 
dence (like in ([T]) but in the reverse direction, see below the exact formulae). 
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The result of the reconstruction stage is a time series additive component: 
= y^jUjVj'^ G = {go,. . . ,gN-i). 

For the sake of brevity, let us describe the hankelization of the matrix 
and the subsequent reconstruction of a time series component G as 
being applied to a matrix Y = {yijY~^i~^ as it is introduced in [11]. First 
we introduce L* = min{L,Er}, K* = max{L, i^} and define an L* x K* 
matrix Y* as given by Y* = Y if L ^ /sT and Y* = Y"^ if L > K. Then 
the elements of the time series G = {gQ, . . . ,gN-i) formed from the matrix 
Y are calculated by averaging along cross-diagonals of matrix Y* as 

n+l 

;i+T ^ ym,n-m+2i ^ n < L* — 1, 
m=l 

E y;;„+2, L*-l^n<K*, (2) 

m=l 
^ N-K''+l 

TT^ y*m,n-m+2^ ^* ^ n < N. 

m=n-K*+2 

Changing the window length parameter and, what is more important, 
the subgroup J of SVD components used for reconstruction, one can change 
the output time series G. In the problem of trend extraction, we are looking 
for G approximating a trend of a time series. Thus, the trend extraction 
problem in SSA is reduced to (i) the choice of a window length L used for 
decomposition and (ii) the selection of a subgroup J of SVD components 
used for reconstruction. The first problem is thoroughly discussed in section 
1.6 of [llj. In this paper, we propose a solution for the second problem. 

Note that for the reconstruction of a time series component, SSA consid- 
ers the whole time series, as its algorithm uses SVD of the trajectory matrix 
built from all parts of the time series. Therefore, SSA is not a local method 
in contrast to a linear filtering or wavelet methods. On the other hand, this 
property makes SSA robust to outliers, see [11] for more details. 

An essential disadvantage of SSA is its computational complexity for the 
calculation of SVD. This shortcoming can be reduced by using modern ^ 
and parallel algorithms for SVD. Moreover, for trend revision in case of 
receiving new data points, a computationally attractive algorithm of [12] for 
updating SVD can be used. 

It is worth to mention here that the similar ideas of using SVD of the tra- 
jectory matrix have been proposed in other areas, e.g. in signal extraction in 
oceanology [8j and estimation of parameters of damped complex exponential 
signals [13] . 



4 



2.1 Trend in SSA 



SSA is a nonparametric approach which does not need a priori specification 
of models of time series and trend, neither deterministic nor stochastic ones. 
The classes of trends and residuals which can be successfully separated by 
SSA are characterized as follows. 

First, since we extract any trend by selecting a subgroup of all d SVD 
components, this trend should generate less than d SVD components. For 
an infinite time series, a class of such trends coincides with the class of time 
series governed by finite difference equations [TI] . This class can be described 
explicitly as linear combinations of products of polynomials, exponentials 
and sines [6]. An element of this class suits well for representation of a 
smooth and slow varying trend. 

Second, a residual should belong to a class of time series which can be 
separated from a trend. The separability theory due to [14j helps us deter- 
mine this class. In [14] it was proved that (i) any deterministic function can 
be asymptotically separated from any ergodic stochastic noise as the time 
series length and window length tend to infinity; (ii) under some conditions 
any trend can be separated from any quasi-periodic component, see also [IT]. 
These properties of SSA make this approach feasible for trend extraction in 
the presence of noise and quasi-periodic oscillating components. 

Finally, as trend is a smooth and slow varying time series component, it 
generates SVD components with smooth and slow varying EOFs. Eigenvec- 
tors represent an orthonormal basis of a trajectory vector space spanned on 
the columns of trajectory matrix. Thus each EOF is a linear combination 
of portions of the corresponding time series and inherits its global smooth- 
ness properties. This idea is considered in detail in [llj for the cases of 
polynomial and exponential trends. 

2.2 Existing methods of trend extraction in SSA 

A naive approach to trend extraction in SSA is to reconstruct a trend from 
several first SVD components. Despite its simplicity, this approach works 
in many real-life cases for the following reason. An eigenvalue represents a 
contribution of the corresponding SVD component into the form of the time 
series, see section 1.6 of [H]. Since a trend usually characterizes the shape 
of a time series, its eigenvalues are larger than the other ones, that implies 
small order numbers of the trend SVD components. However, the selection 
procedure fails when the values of a trend are small enough as compared 
with a residual, or when a trend has a complicated structure (e.g. a high- 
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order polynomial) and is characterized by many (not only by the first ones) 
SVD components. 

A smarter way of selecting trend SVD components is to choose the com- 
ponents with smooth and slow varying EOFs (we have explained this fact 
above). At present, there exist only one parametric method of [15] which 
follows this approach. In [15J it was proposed using the Kendall correla- 
tion coefficient for testing for monotonic growth of an EOF. Unfortunately, 
this method is far from perfect since it is not possible to establish which 
kinds of trend can be extracted by its means. This method seems to be 
aimed at extraction of monotonic trends because their EOFs are usually 
monotonic. However, even a monotonic trend can produce non-monotonic 
EOF, especially in case of noisy observations. An example could be a linear 
trend which generates a linear and a constant EOFs. If there is a noise or 
another time series component added, then this component is often mixed 
with trend components corrupting its EOFs. Then, even in case of very 
small corruption, the constant EOF can be highly non monotonic. Nat- 
urally, the method using the Kendall correlation coefficient does not suit 
for non monotonic trends producing non monotonic EOFs. For example, 
a polynomial of low order which is often used for trend modelling usually 
produces non monotonic EOFs, for details see e.g. [H]. 



3 PROPOSED METHOD FOR TREND EXTRAC- 
TION 

In this section, we present our method of trend extraction. First, follow- 
ing [Tl], we introduce the periodogram of a time series. 

Let us consider the Fourier representation of the elements of a time series 
X of length N, X = {xq, . . . ,X]\f-i), see e.g. section 7.3 of [7j: 

Xn = co+ ^ (cfc cos{2imk/N) + Sk sm{27rnk/N)) + (-l)"cjv/2! 

where A;GN, O^n^A^ — 1, and Cjv/2 = if A^ is an odd number. Then 
the periodogram of X at the frequencies co G {k / N}\^^^^ is defined as 



24, k = 0, 

4 + si 0<k<N/2, (3) 
2c^^2' if ^ even number and k = N/2. 
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Note that this periodogram is different from the periodogram usually 
used in spectral analysis, see e.g. [1] or 0. To show this difference, let us 
denote the k-th element of the discrete Fourier transform of X as 



N-l 



n=0 

then the periodogram /^(w) at the frequencies uj G {k/N}^^^ is calculated 
as 



\\rk{X)f , if A; = or iV is even and k = N/2. 

One can see that in addition to the normalization different from that in [4] 
and [7], the values for frequencies in the interval (0,0.5) are multiplied by 
two. This is done to ensure the following property: 

N-l [_N/2\ 
11^112 = E = E Ixik/N). (4) 

n=0 A:=0 

Let US introduce the cumulative contribution of the frequencies [0, lv] as 
^x(^) = Efc:0<fc/iV<c.^x(^/^)> ^ e [0,0.5]. Then, for a given uj^ G (0,0.5), 
we define the contribution of low frequencies from the interval [0,a;o] to 
X G as 

C(X,u;o) =vrf(c^o)Ax(0.5). (5) 

Then, given parameters ujq G (0,0.5) and Co G [0,1], we propose to select 
those SVD components whose eigenvectors satisfy the following criterion: 

C([/j,u;o) ^Co, (6) 

where Uj is the corresponding j-th eigenvector. One may interpret this 
method as selection of SVD components with EOFs mostly characterized 
by low-frequency fluctuations. It is worth noting here that when we apply 
C, vr or / (defined above for a time series) to a vector, they are simply applied 
to a series of elements of the vector. 

Having the trend SVD components selected using ([6]), one reconstructs 
the trend according to section [2l The question is how to select ujq and how 
to define the threshold Cq. These issues are discussed in sections H] and [5l 
respectively. 
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4 THE LOW-FREQUENCY BOUNDARY uq 



The low-frequency boundary ujq manages the scale of the extracted trend: 
the lower is loq, the slower varies the extracted trend. Selection of ujq can be 
done a priori based on additional information about the data thus prespec- 
ifying the desired scale of the trend. 

For example, if we assume to have a quasi-periodic component with 
known period T, then we should select ujq < 1/T in order not to include 
this component in the trend. For extraction of a trend of monthly data with 
possible seasonal oscillations of period 12, we suggest to select loq < 1/12, 
e.g. ujo = 0.075. 

In this paper we also propose a method of selection of ujq considering 
a time series periodogram. Since a trend is a slow varying component, its 
periodogram has large values close to zero frequency and small values for 
other frequencies. The problem of selecting cjq is the problem of finding 
such a low-frequency value that the frequencies corresponding to the large 
trend periodogram values are inside the interval [0,u;o]- At the same time, 
ujQ cannot be too large because then an oscillating component with a fre- 
quency less than loq can be included in the trend produced. Considering the 
periodogram of a trend, we could find the proper value of loq but for a given 
time series its trend is unknown. 

What we propose is to choose (Jq based on the periodogram of the original 
time series. The following proposition substantiates this approach. 

Proposition 4.1. Let us have two time series G = {go, . . . , qn-i) (^nd H = 
{ho, . . . , /lAT-i) of length N, then for each k: < k < [N/2\ the following 
inequality holds: 



We denote as Ck,x and Sk,x the coefficients of Fourier representation of a time 
series X used in the periodogram definition ([3]). Then, by this definition. 



Since Ck,G+H = jq^^k{G+H) = Ck^c+Ck^n (where SRz denotes an imaginary 
part of a complex number z) and, analogously, Sk^c+H = Sk,G + Sk,H^ we 
have 



lG+H{k/N) - IB {k/N) - (k/N) = N {ck,GCk,H + Sk,HSk,H) • (8) 




^S^j,{k/N)-lg{k/N)-lS{k/N) = 



_^W2 |2 2 2 2 2 

— 17 \^k,G+H "T Sk,G+H ~ (^k,G ~ ^k,G ~ '^k,H ~ Sk,H, 
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Let us consider the periodograms multiplication used in the right part of 

i\r2 

{k/N)I§ {k/N) = — {cla + 4,g) {cIh + sIh) ■ (9) 

Since for all real a, b, c, and d it holds that (a^ + 6^)(c^ + = (|ac| + + 
{\ad\ - |6c|)2, then 

lQ{k/N)lH{k/N) = — (|cfc,GCfc,H| + \sk,GSk,H\f+{\ck,GSk,H\ - \ck,HSk,G\f ■ 

(10) 

Finally, taking the square of (l8|), dividing it by four and taking into ac- 
count (fTO]l . we have 

i(/^+^(fc/iV) - I^{k/N) - iSik/N))' = 

2 2 
= — {Ck,GCk,H + Sk,GSk,H) ^ — {\Ck,GCk,H\ + \Sk,GSk,H\) ^ 

< — {\Ck,GCk,H\ + \sk,GSk,H\) + {\ck,GSk,H\ - \ck,HSk,G\) = 

= I^{k/N)I^{k/N) 



and the inequality in ^ holds < /c < N/2. 

Second, we consider the case when A; = or A; = N/2. Again, by the 
definition of the periodogram 

2^I^{k/N)q{k/N) = 2,/n^4^ = 2N \ck,GCk,H\ ■ 
At the same time, 

\I^+Hik/N)-I^ ik/N)-lS{k/N)\ = N \cla+H - cIg - cIh\ = N \2ck,GCk,H 
which leads for A; = or = N/2 to 

\l^Mk/N)-I^{k/N)-lfi{k/N)\ = 2^I^{k/N)I^{k/N) 
and the result in ([7]) holds with equality. 

□ 

Corollary 4.2. Let us define for a time series F of length N the frequency 
support of the periodogram Ip as a subset of frequencies {k/N}^^^^ such 
that Ip{k' /N) > for k' /N from this subset. If the frequency supports of two 
time series G and H are disjoint then lQ^u{k/N) = lQ{k/N) + I^{k/N) 
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Let us demonstrate that when supports of periodograms of time series 
G and H are nearly disjoint, the periodogram of the sum G + H is close to 
the sum of their periodograms. 

The fact that the periodograms of G and H are very different at k/N 
can be expressed as 

/^(fc/iV)//^(A:/iV)=d»l, 

since without loss of generality we can assume lQ{k/N) > I^{k/N). Then 
using Proposition 14.11 we have that 

\I^+H{k/N)-I^{k/N)-I§{k/N)\ ^ 

2^I^{k/N)I§{k/N) = -^Ig {k/N) « (k/N), 

that means that the difference \l^^jf{k/N) - {k/N) - I^{k/N)\ is sig- 
nificantly smaller than the value of the largest periodogram (of Iq , I^) at 
the point k/N . 

In many applications, the given time series can be modelled as made 
of a trend with large periodogram values at low- frequency interval [0,tt;o], 
oscillations with periods smaller than I/wq, and noise whose frequency con- 
tribution spreads over all the frequencies [0, 0.5] but is relatively small. In 
this case the periodogram supports of the trend and the residual can be 
considered as nearly disjoint. Therefore, from Corollary 14.21 we conclude 
that the periodogram of the time series is approximately equal to the sum 
of the periodograms of the trend, oscillations and noise. 

For a time series X of length A^, we propose to select the value of the 
parameter loq according to the following rule: 

u;o = max {fc/iV : /i^(0), . . .,I^{k/N) < M#}, (11) 

where is the median of the values of periodogram of X. The mod- 
elling of a time series as a sum of a trend, oscillations and a noise (let us 
suppose to have a normal noise) motivates this rule as follows. Since the 
frequency supports of the trend and oscillating components do not overlap, 
only the values of the noise periodogram can mix with the values of the 
trend periodogram. First, the values of the noise periodogram for neighbor- 
ing ordinates are asymptotically independent (see e.g. section 7.3.2 of [7]). 
Second, supposing a relatively long time series and narrow frequency sup- 
ports of trend and oscillating components, the median of values of the time 
series periodogram gives an estimation of the median of the values of the 
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noise periodogram. Since a trend is supposed to have large contribution to 
the shape of the time series (i.e. a large L2-norm) compared to the noise 
and its frequency support is quite narrow compared to the whole interval 
[0,0.5], its periodogram values are relatively larger than the median of the 
noise periodogram values due to (jl]). Therefore, the condition used in (fTT|) 
is fulfilled only for such a frequency ojq that the trend periodogram values is 
close to zero (outside the trend frequency interval). Large noise periodogram 
values in this frequency region can lead to selecting larger than necessary ujq. 
But remember that we compare the periodogram values with their median 
and the noise periodogram values are independent (asymptotically). Hence, 
with probability approximately equal to 1 — 0.5™ (e.g. this value is equal to 
0.9375 for m = 4) we select the m-th point (of the grid {k/N}) located to 
the right side of the trend frequency interval (where the trend peridogram 
values are larger then the noised periodogram median). 

Note that the lengths of the time series and L of eigenvector are 
different {L < N) which causes different resolution of their periodograms. 
Having estimated loq after consideration of the periodogram of the original 
time series, one should select 

u;'o=\Lu;o]/L. (12) 

Dependence of ujq on the time series resolution. Let us define the 
resolution p of the original time series as p = (t^+i — t„)~-^, where r„ is 
the time of n-th measurement. If one have estimated ujq for the data with 
resolution p and there comes the same data but measured with higher res- 
olution p' = mp [m G M) thus increasing the data length in m times, then 
in order to extract the same trend, one should take the new threshold value 
Wq = u)Q/m. In a similar manner, after decimation of the data reducing the 
resolution in m times, the value uj'q = mujo should be taken. 

Example 4.3 (The choice of for a noised exponential trend). Let us 

consider an example of selection of the treshold loq for an exponential trend 
and a white Gaussian noise which also demonstrates Proposition \4-l\ Let 
the time series F = G + H be of length N = 120, where the components 
G and H are defined as Qn = Ae^'^"^"" , h^ = Bsn, £n ~ iidN{(),l) and 
A, B are selected so that ||G|p = = ^n=Q d-r^ ~ "12^=0 = 1- The 

normalization is done to ensure that YI^o-^g W-^) ~ Sfc=o-^^(^/^) ~ ^• 
Figure {1\ shows a) the simulated time series F, b) its components, c) the 
periodograms of the components, d) the periodograms zoomed together with 
a line corresponding to the median of the noise periodogram values equal 
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to 0.0126, e) the periodogram Ip of F and a kind of "confidence" interval 
of its estimation Iq + calculated according Proposition \4-l\ md a line 
corresponding to the median of the time series periodogram values 

(used for estimating loq), and f) the discrepancy, the difference between Ip 
and Iq + 1^ together with the values of this difference estimated in the right 
side of Note tha the median of the periodogram values of F is equal to 
0.0141, which is close to the median of the noise periodogram values equal 
to 0.0126. The value of loq estimated according to the proposed rule Ul\) is 
equal to 6/120 = 0.05. 



a) Original time series b) Time series components c) Periodograms of 
F ^ G + H G and H components G and H 




d) Periodograms of e) Periodogram of F and 

G and H (zoomed) its estimates Discrepancy 




Figure 1: The choice of for an exponential trend and Gaussian noise; The 
value C{u)) used in the legends is equal to 2\ Iq{ijj)I^{oj). 



5 THE LOW-FREQUENCY CONTRIBUTION Cc 

Before suggesting a procedure for selection of the second parameter of the 
proposed method, the low- frequency threshold Co, we investigate the effect 
of the choice of Cq on the quality of the trend extracted. For this aim, we con- 
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sider a time series model with a trend that generates SVD components with 
known numbers. Then, for a sufficient number of simulated time series, we 
compare our trend extraction procedure with a SSA-based procedure which 
simply reconstructs the trend using the known trend SVD components. 

5.1 A simulation example: an exponential trend plus a Gaus- 
sian noise 

The model considered is the same as in example above. Let the time series 
F = (/o, . . . , /n-i) consist of an exponential trend t„ plus a Gaussian white 
noise r„: 

fn = tn + Tn, tn = e"", r„ = ae""e„, £n ~ iidNiO, 1). (13) 

According to [llj, for such a time series with moderate noise the first SVD 
component corresponds to the trend. We considered only the noise lev- 
els when this is true (empirically checked). Note that the noise r„ has a 
multiplicative model as its standard deviation is proportional to the trend. 

In the following, we consider the following properties. First, we calculate 
the difference between the trend in (Co) resulted from our method with Co 
used and the reconstruction tn of the first SVD component exploiting the 
weighted mean square error (MSE) because this measure is more relevant 
for a model with a multiplicative noise than a simple MSE: 

N-l 

V{in{Co),tn) = ^ E e~'""(in(Co) - inf. (U) 
n=0 

This measure compares our trend and the ideal SSA trend. Second, we 
calculate the weighted mean square errors between t„(Co)i *n and the true 
trend tn separately: 

^ N-l N-l 

niniCo)) = ]^ E - in{Co))\ V{in) = ^ E (^n " . 

n=0 n=0 

(15) 

5.1.1 Scheme of estimation of the errors using simulation 

The errors (114j) . (llSp are estimated using the following scheme. We simulate 
S realizations of the time series F according to the model (jl3p and calculate 
the mean of P(t„(Co),tn) for all values of Co from the large grid 0:0.01:1: 

_ 1 ^ 

V{in{Colin) = -E^(4^^(Co),4^)), (16) 
s=l 
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where tn (Co) and in denote trends of the s-th simulated time series. The 
mean errors P(t„(Co)), T>{tn) between the true trend t„ and the extracted 
trends tn(Co) and i„, respectively, are calculated similarly. Let us also denote 
the minimal values of the mean errors as 

^™"(tn, tn) = minP(t„(Co), t„), P"'"(t„) = minP(t„(Co)) (17) 
Co Co 

and the value of Cq providing the minimal mean error between the extracted 
trend and the ideal SSA trend as 

= argmm'D{in{Co),in), 
Co 

SO that P™"(4,t„) = P(t„(C°^*),t„). 

The simulated time series are of length = 47. In order to achieve the 
best separability [11] we have selected the SSA window length L = [A/2] = 
24. The estimates of the mean errors are calculated on S" = 10^ realizations 
of the time series. 

We consider different values of the model parameters a and a. The 
values of a are (corresponding to a constant trend), 0.01 and 0.02 which 
correspond to the increase of trend values (from Iq to tiv-i) in 1, 1.6 and 2.5 
times, respectively. The levels of noise are 0.2 ^ cr ^ 1.6. It was empirically 
checked that for such levels of noise the first SVD component corresponds 
to the trend. 

Moreover, we estimated the probability of the type I error of not selecting 
the first SVD component as the ratio of times when the first component is 
not identified as a trend component by our procedure to the number of 
repetitions S. 

Choice of ajQ. In order to select the low- frequency threshold ujq, we con- 
sidered several simulated time series with different a and the maximal noise 
a = 1.6. Two examples of their periodograms for a = and a = 0.02 
are depicted in Figure [2l The median values for the periodograms depicted 
in Figure [2] are 2.936 and 2.924 which leads to = for a = and 
Wo = [1/A^ ■ L]/L = 1/24 ~ 0.042 for a = 0.02 estimated using We 
decided to take the same ujq = 0.042 (the largest one) for all a considered. 

5.1.2 Simulation results 

Figure [3] shows the evolution of the square roots of the mean errors and Cq^* 
as a function of a. The values a = and a = 0.02 are used. The square roots 
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Periodograms of the simulated time series 
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-Ip{u), Q = 

■lFh),a = 0.02 



this line corresponds to 
the medians of the time series 
periodogram values, equal to 
2.936 for a=0 and 
2.924 for a=0.02 




Figure 2: The periodograms of two time series of the model ()13p with a = 1.6 
and a = 0,0.02. 

of the mean errors (i.e. standard deviations) are taken for better comparison 
with a which is the standard deviation multipher of the noise. 

The plots of the minimal mean error and the optimal Cq^* 

for a = 0.02 are depicted in Figure El where the values for a = are also 
shown in gray color. The estimates for a = 0.01 are not reported here. 

The interpretation of the produced results is as follows. First, the trend 
extracted with the optimal Cq is very similar to the ideal SSA trend, recon- 
structed by the first SVD component since V {tn,tn) ^ ^ (in) (the 
error between our trend and the ideal trend is much smaller than the error 
of the ideal trend itself), especially when a ^ 0.8. Moreover, the estimated 
probability of the type I error (i.e. the probability of not selecting the first 
SVD component) is less than 0.05 for a ^ 1.4. All this allows us to conclude 
that in case of an exponential trend and a white Gaussian noise the proposed 
method of trend extraction with an optimal Cq with high probability selects 
the required first SVD component corresponding to the trend. 

The trend t„(Co^*) extracted with an optimal Cq estimates the true trend 

quite good when comparing the deviation yV (t„) with the noise stan- 

dard deviation a. For example, for a = 1.6 the value of yV (in) is 
approximately equal to 0.5. 

Note that for different a the mean errors n) are very similar 

though the used optimal values of Cq are quite different (Figure [3|) . This 
shows that the method adapts to the change of the model parameter a. 

Let us consider the dependence of inaccuracy of the proposed trend ex- 
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Square roots of the mean P-errors Co providing minimal I?(f„(Co), 




8.2 0.4 0.6 0.8 1 1.2 1.4 1.6 8.2 0.4 0.6 0.8 1 1.2 1.4 1.6 

G G 



Figure 3: The square roots of the mean errors (top left) 

^^*"(^n) (bottom left) and 'D{in) (bottom right) as well as the optimal 
Co value providing a minimal mean error V {tn, tn) between the extracted 
trend and the ideal SSA trend (top right); all for a = and a = 0.02. 

traction method on the value of Cq. As above, the inaccuracy is measured 

fYl.l.'tl. — 

with the minimal mean error V {tn,tn) between the extracted trend and 
the ideal SSA trend. Figure H] shows the graphs of this error as a function 
of Cq for different exponentials a and noise levels a. 



One can see that it is crucial not to select too large Cq since in this 
case the trend component can be not included in the reconstruction (that is 
also confirmed by the estimated probability of the type I error which is not 
reported here). At the same time without significant loss of accuracy one 
can choose Cq smaller than Cq^* (corresponding to the best accuracy). This 
is true due to the small contribution of each of noise components which can 
be erroneously included for Cq < Cq^^. 
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0.2 0.4 0.6 0.8 1 



Co 

Figure 4: The error I?(t„(Co), in) as a function of Co for different combina- 
tions of a = 0, 0.02 and a = 0.8, 1.4. 



5.2 Heuristic procedure for the choice of Cq 

Based on the observations of section I5.H we propose the following heuristic 
procedure for choosing the value of the method low- frequency threshold Cq. 

As discussed, trend EOFs vary slow. First we show that this property is 
inherited by the trend elementary reconstructed components, the time series 
components each reconstructed from one trend SVD component. 

Proposition 5.1. Let {\/^,U,V) be an eigentriple of SSA decomposition 
of a time series F, U = (ui, . . . , ul)"^ , V = (vi, . . . , vi)"^ , and G he a time 
series reconstructed by this eigentriple. If it is true that 

36i, 62 £ ^ ■ yk, 1 ^ k ^ L — 1 : \uk+i — u^l < Si, \vk+i — Vk\ < 62, 
then for the elements of G = ((70, • • • , 9n~i) the following holds: 

3e{5i,62): \/n, L* - I ^ n < K* : gn+i-Qn <e{5i,52), 
where L* = min{L,i^}, K* = max{L, K} . 

Proof of Proposition \5.1[ One can easily prove this proposition taking into 
account how the elementary reconstructed component G is constructed from 
its eigentriple {VX,U,V), see section [2j First, the matrix Y = \/^UV'^ is 
constructed. Second, the hankelization of Y is performed. 
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Let us show how to calculate e using ([2]) for 6i, 82 when L ^ K. For 
other cases 6(61,62) is calculated similarly. 



9n+i — Qn 



L 



m=l 



< 



Vx^. „ 

< —j- \Um\\Vn-m+3 — Vn-m+2\ < 



L 

m=l 



< ^62 \Um\ < 62^{ui + {L- l)5i). 
m=l 



□ 



Let us have a time series F and denote its trend extracted with the 
method with parameters ujq, Cq as T(a;o,Co). In order to propose the 
procedure selecting Cq, we first define the normalized contribution of low- 
frequency oscillations in the residual F — T{u)o,Co) as: 

■^F,a;o(Co) = C{F - T{uJo,Co),UJo)CiF,UJo)~^, 

where C is defined in equation ([5]) . 

Based on Proposition 15.11 we expect that the elementary reconstructed 
components corresponding to a trend have large contribution of low fre- 
quencies. Thus, the maximal values of Cq which lead to selection of trend- 
corresponding SVD components should generate jumps of TZF,uioi^o)- 

Exploiting this idea, we propose the following way of choosing Cq: 

Cf = minjCo G [0, 1] : 7^i.,^„(Co + AC) - 7^F,a.o(Co) > A7^}, (18) 

where AC is a search step and /S.1Z is the given threshold. On one hand, this 
strategy is heuristic and requires selection of ISJZ, but on the other hand, the 
simulation results and application to different time series showed its ability 
to choose reasonable Cq in many cases. Based on this empirical experience, 
we suggest using 0.05 < ATZ < 0.1. The step AC is to be chosen as small 
as possible to discriminate identifications occurring at different values of Cq. 
To reduce computational time, we commonly take AC > 0.01 and suggest a 
default value of AC = 0.01. 



6 EXAMPLES 

Simulated example with polynomial trend. The first example illus- 
trates the choice of parameters ujq and Cq. We simulated a time series of 
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Figure 5: Simulated example with a polynomial trend: original time series 
(top left); the original trend and an extracted one with L = 180, AC = 
0.01, and ATZ = 0.05 (top right); zoomed time series periodogram inside 
uj £ [0,0.25] (bottom left); the values of '7^f,wo(Co + AC) — TZf,u)o{Co) used 
for the choice of Co resulted in a value Cq^ = 0.53 (bottom right). 

length N = 300, shown in Figure El containing a polynomial trend, an 
exponentially-modulated sine wave, and a white Gaussian noise, whose n-th 
element is expressed as /„ = W'^^in - 10) (n - 70) (n - 160)2(n - 290)^ + 
exp(O.Oln) sin(27rn/12)+en, e„ is iidN{0, 5^). The period of the sine wave 
is assumed to be unknown. 

We have chosen the window length L = N/2 = 150 for achieving better 
separability of trend and residual. The value loq = Q/N = 0.02 was selected 
using (jlip . where the calculated median value is ~ 37.06. The search 
for Co using (fT8]l has been done with step AC = 0.01 and ATZ = 0.05. As 
shown in Figure [5l despite of the strong noise and oscillations, the extracted 
trend approximates the original one very well. The achieved mean square 
error is 0.79. For example, the ideal low pass filter with the cutoff frequency 
0.02 produced the error of 3.14. This superiority is achieved mostly due to 
better approximation at the first and last 50 points of the time series. All 
the calculations were performed using our Matlab-based software AutoSSA 
available at http://www.pdmi.ras.ru/~theo/autossa. 
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Figure 6: Unemployment level in Alaska: original data (left-hand side 
panel), zoomed periodogram (right-hand side panel). 

Trends of the unemployment level. Let us demonstrate extraction 
of trends of different scale. We took the data of the unemployment level 
(unemployed persons) in Alaska for the period 1976/01-2006/09 (monthly 
data, seasonally adjusted), provided by the Bureau of Labor Statistics at 
http://www.bls.gov under the identifier LASST02000004 (Figure [6]). This 
time series is typical for economical applications, where data contain rela- 
tively little noise and are subject to abrupt changes. Economists are often 
interested in the "short" term trend which includes cyclical fluctuations and 
is referred to as trend-cycle. 

The length of the data is = 369. For achieving better separability of 
trend and residual we selected L close to N/2 but divisible by the period 
T = 12 of probable seasonal oscillations: L = 12[A/24J = 180. 

We extracted trends of different scales using the following values of ujq: 
0.01, 0.02, 0.05, 0.075, and 0.095, see Figure El for the results. The value 
0.095 ~ [33/369 • 180] /180 was selected according to where ~ 

5.19 • 10^. The value 0.075 is the default value for monthly data (section S]) . 
Other values (0.01, 0.02 and 0.05) were considered for better illustration of 
how the value of ujq influences the scale of the extracted trend. The search 
for Co was performed as described in section [5] in the interval [0.5, 1] with 
the step AC = 0.01 and A7^ = 0.05. 
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Figure 7: Unemployment level in Alaska: extracted trends of different scales 
with ujQ = 0.01,0.02,0.05,0.075, and 0.095 (L = 180, AC = 0.01, and A7^ = 
0.05). 

7 CONCLUSIONS 

SSA is an attractive approach to trend extraction because it: (i) requires no 
model specification of time series and trend, (ii) extracts trend of noisy time 
series containing oscillations of unknown period. In this paper, we presented 
a method which inherits these properties and is easy to use since it requires 
selection of only two parameters. 
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